Improved spelling recognition using a tree-based fast lexical match

Authors

Carl D. Mitchell

Anand R. Setlur

Abstract

This paper addresses the problem of selecting a name from a very large list using spelling recognition. In order to greatly reduce the computational resources required, we propose a tree-based lexical fast match scheme to select a short list of candidate names. Our system consists of a free letter recognizer, a fast matcher, and a rescoring stage. The letter recognizer uses n-grams to generate an n-best list of letter hypotheses. The fast matcher is a tree that is based on confusion classes, where a confusion class is a group of acoustically similar letters such as the e-set. The fast matcher reduces over 100,000 unique last names to tens or hundreds of candidates. Then the rescoring stage picks the best name using either letter alignment or a constrained grammar. The fast matcher retained the correct name 99.6% of the time and the system retrieved the correct name 97.6% of the time.
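The fast-match idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: only the e-set is named in the abstract, so every other confusion class below is an assumption, as are the names `CONFUSION_CLASSES`, `build_tree`, and `fast_match`. The sketch maps each letter to its confusion class, stores the name list in a tree keyed by class sequences, and looks up each n-best letter hypothesis to obtain a short candidate list. It handles substitutions within a class only; letter insertions and deletions, which a real recognizer would also produce, are not modeled here.

```python
# Hypothetical confusion classes. Only the e-set is mentioned in the abstract;
# the remaining groups are illustrative assumptions.
CONFUSION_CLASSES = [
    "BCDEGPTVZ",  # e-set: letters whose spoken names rhyme with "ee"
    "AJK",        # assumed group
    "MN",         # assumed group
    "FSX",        # assumed group
]

LETTER_TO_CLASS = {
    letter: index
    for index, group in enumerate(CONFUSION_CLASSES)
    for letter in group
}


def class_of(letter):
    """Return the confusion-class key for a letter.

    Letters outside every group act as their own singleton class.
    """
    letter = letter.upper()
    return LETTER_TO_CLASS.get(letter, letter)


class Node:
    """One tree node: children keyed by confusion class, names ending here."""

    def __init__(self):
        self.children = {}
        self.names = []


def build_tree(names):
    """Insert every name into a tree keyed by its confusion-class sequence."""
    root = Node()
    for name in names:
        node = root
        for letter in name:
            node = node.children.setdefault(class_of(letter), Node())
        node.names.append(name)
    return root


def fast_match(root, hypotheses):
    """Collect names whose class sequence equals that of any hypothesis."""
    candidates = []
    seen = set()
    for hypothesis in hypotheses:
        node = root
        for letter in hypothesis:
            node = node.children.get(class_of(letter))
            if node is None:
                break  # no name in the list shares this class prefix
        else:
            for name in node.names:
                if name not in seen:
                    seen.add(name)
                    candidates.append(name)
    return candidates


tree = build_tree(["MITCHELL", "SETLUR", "DEAN", "BEAM", "TEAL"])
print(fast_match(tree, ["PEAN"]))
```

In this toy list the misrecognized spelling "PEAN" shares a class sequence with both "DEAN" and "BEAM", so both survive the fast match and a later rescoring stage would have to choose between them.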

Similar resources

Lexical orthography acquisition: Is handwriting better than spelling aloud?

Lexical orthography acquisition is currently described as the building of links between the visual forms and the auditory forms of whole words. However, a growing body of data suggests that a motor component could further be involved in orthographic acquisition. A few studies support the idea that reading plus handwriting is a better lexical orthographic learning situation than reading alone. H...

Design and implementation of Persian spelling detection and correction system based on Semantic

The Persian language has special features (grapheme, homophone, and multi-shape clinging characters) in electronic devices. Furthermore, the design and implementation of NLP tools for Persian are more challenging than for other languages (e.g. English or German). Spelling tools are widely used for editing user texts such as emails and text in editors. Also developing Persian tools will provide Persian progr...

Speed improvement of the time-asynchronous acoustic fast match

This paper describes an algorithm for improvement of the speed of a time-asynchronous fast match, which is a part of a stack-search based recognition system. This fast match uses a phonetic tree to represent the entire vocabulary of the recognizer. Evaluation of the tree (in a depth-first manner) can be done much more efficiently using the fact that under certain conditions, the results of branch e...

SimSem: Fast Approximate String Matching in Relation to Semantic Category Disambiguation

In this study we investigate the merits of fast approximate string matching to address challenges relating to spelling variants and to utilise large-scale lexical resources for semantic class disambiguation. We integrate string matching results into machine learning-based disambiguation through the use of a novel set of features that represent the distance of a given textual span to the closest...

Spelling consistency affects reading in young Dutch readers with and without dyslexia.

Lexical-decision studies with experienced English and French readers have shown that visual-word identification is not only affected by pronunciation inconsistency of a word (i.e., multiple ways to pronounce a spelling body), but also by spelling inconsistency (i.e., multiple ways to spell a pronunciation rime). The aim of this study was to compare the reading behavior of young Dutch readers wi...

Publication date: 1999
